Characterizing the Heterogeneity of the OpenStreetMap Data and Community
Authors
Abstract
OpenStreetMap (OSM) constitutes an unprecedented, free, geographical information source contributed by millions of individuals, resulting in a database of great volume and heterogeneity. In this study, we characterize the heterogeneity of the entire OSM database and historical archive in the context of big data. We consider all users, geographic elements, and user contributions from an eight-year data archive of 692 GB. We rely on nonlinear methods such as power law statistics and head/tail breaks to uncover and illustrate the underlying scaling properties. All three aspects (users, elements, and contributions) demonstrate striking power laws or heavy-tailed distributions. The heavy-tailed distributions imply that there are far more small elements than large ones, far more inactive users than active ones, and far more lightly edited elements than heavily edited ones. Furthermore, about 500 users in the core group of OSM are highly networked in terms of collaboration.
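The head/tail breaks method named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `head_tail_breaks`, the 40% head limit, and the sample edit counts are assumptions chosen for illustration. The idea is to split the data at its mean, keep the values above the mean (the head), and repeat on the head for as long as the head remains a minority of the data.

```python
from statistics import mean


def head_tail_breaks(values, head_limit=0.4):
    """Return the successive mean values used as class breaks.

    head_limit is an assumed threshold: splitting continues only while
    the head (values above the mean) is at most this share of the data.
    """
    breaks = []
    data = list(values)
    while len(data) > 1:
        m = mean(data)
        head = [v for v in data if v > m]
        # Stop when nothing lies above the mean or the head is no
        # longer a clear minority (the data is not heavy-tailed here).
        if not head or len(head) / len(data) > head_limit:
            break
        breaks.append(m)
        data = head
    return breaks


# Hypothetical edit counts per user: many light editors, few heavy ones.
sample_edits = [1] * 80 + [10] * 15 + [100] * 4 + [1000]
print(head_tail_breaks(sample_edits))
```

The number of breaks returned gives a rough indication of how many scaling levels the data has; evenly spread data yields no breaks at all under this rule.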
Similar resources
Comparative Analysis of Intra-and Inter Populational Heterogeneity of the Essential Oils in White Savory Plants
White savory (Satureja mutica Fisch & C.A.Mey.) is one of the most widely used medicinal plants in the food processing, pharmaceutical, and cosmetic industries due to its strong scent and the presence of phenolic compounds such as carvacrol and thymol. This experiment was carried out to evaluate the levels of inter- and intra-population variability of essential oil compositions of S. mutica grown in no...
Comparative Spatial Analysis of Positional Accuracy of OpenStreetMap and Proprietary Geodata
The emergence and ubiquitary availability of geotechnologies yield a rapid increase of user generated geographical data, utilized for mapping, modeling etc. On the example of a well mapped German city this paper analyzes the positional accuracy of OpenStreetMap and TomTom data by means of a statistical comparative approach using official survey data as the reference dataset. The results show th...
Using Regression based Control Limits and Probability Mixture Models for Monitoring Customer Behavior
In order to achieve the maximum flexibility in adaptation to ever changing customer’s expectations in customer relationship management, appropriate measures of customer behavior should be continually monitored. To this end, control charts adjusted for buyer’s/visitor’s prior intention to repurchase or visit again are suitable means taking into account the heterogeneity across customers. In the ...
OpenStreetMap: Quality assessment of Brazil's collaborative geographic data over ten years
OpenStreetMap is a collaborative mapping tool in which users actively include, transform and exclude geographic data. Consequently, the quality and consistency of the information made available in the tool is of constant concern. To address this issue, this work performs an analysis of some of the quality parameters within OpenStreetMap, with the data referring to the region corresponding to Br...
The generation of fuzzy sets and the construction of characterizing functions of fuzzy data
Measurement results contain different kinds of uncertainty. Besides systematic errors and random errors, individual measurement results are also subject to another type of uncertainty, so-called fuzziness. It turns out that special fuzzy subsets of the set of real numbers $\mathbb{R}$ are useful to model fuzziness of measurement results. These fuzzy subsets $x^*$ are called fuzzy numbers. The m...
Journal title:
- ISPRS Int. J. Geo-Information
Volume 4, Issue
Pages -
Publication date 2015